94 research outputs found
Medical WordNet: A new methodology for the construction and validation of information resources for consumer health
A consumer health information system must be able to comprehend both expert and non-expert medical vocabulary and to map between the two. We describe an ongoing project to create a new lexical database called Medical WordNet (MWN), consisting of medically relevant terms used by and intelligible to non-expert subjects and supplemented by a corpus of natural-language sentences that is designed to provide medically validated contexts for MWN terms. The corpus derives primarily from online health information sources targeted to consumers, and involves two sub-corpora, called Medical FactNet (MFN) and Medical BeliefNet (MBN), respectively. The former consists of statements accredited as true on the basis of a rigorous process of validation, the latter of statements which non-experts believe to be true. We summarize the MWN / MFN / MBN project, and describe some of its applications.
WordNet: An Electronic Lexical Reference System Based on Theories of Lexical Memory
This paper describes WordNet, an on-line lexical reference system whose design is based on psycholinguistic theories of human lexical organization and memory. English nouns, verbs, and adjectives are organized into synonym sets ("synsets"), each representing one underlying lexical concept. Synonym sets are then related via three principal conceptual relations: hyponymy, meronymy, and antonymy. Verbs are additionally specified for presupposition relations that hold among them, and for their most common semantic/syntactic frames. By attempting to mirror the organization of the mental lexicon, WordNet strives to serve the linguistically unsophisticated user.
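The synset organization described in this abstract can be pictured with a small sketch. This is a toy illustration under stated assumptions, not WordNet's actual data format or API: the `Synset` class, its field names, and the three sample entries are hypothetical, and only upward "is a kind of" links (the inverse of hyponymy) and part-of links (meronymy) are modeled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Synset:
    """A set of synonymous words standing for one lexical concept (toy model)."""
    name: str
    lemmas: List[str]
    # Upward "is a kind of" links; a hyponym points to its more general synset.
    hypernyms: List["Synset"] = field(default_factory=list)
    # Meronymy: wholes that this concept is a part of.
    part_of: List["Synset"] = field(default_factory=list)


def hypernym_chain(synset: Synset) -> List[str]:
    """Follow the first hypernym link upward until a synset with no hypernym is reached."""
    chain = [synset.name]
    while synset.hypernyms:
        synset = synset.hypernyms[0]
        chain.append(synset.name)
    return chain


# Hypothetical entries for illustration; not taken from the WordNet database.
animal = Synset("animal", ["animal", "creature"])
dog = Synset("dog", ["dog", "domestic dog"], hypernyms=[animal])
puppy = Synset("puppy", ["puppy"], hypernyms=[dog])

print(hypernym_chain(puppy))
```

Storing relations between synsets rather than between individual words reflects the abstract's point that each synonym set represents a single lexical concept.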
MABEL: Attenuating Gender Bias using Textual Entailment Data
Pre-trained language models encode undesirable social biases, which are
further exacerbated in downstream use. To this end, we propose MABEL (a Method
for Attenuating Gender Bias using Entailment Labels), an intermediate
pre-training approach for mitigating gender bias in contextualized
representations. Key to our approach is the use of a contrastive learning
objective on counterfactually augmented, gender-balanced entailment pairs from
natural language inference (NLI) datasets. We also introduce an alignment
regularizer that pulls identical entailment pairs along opposite gender
directions closer. We extensively evaluate our approach on intrinsic and
extrinsic metrics, and show that MABEL outperforms previous task-agnostic
debiasing approaches in terms of fairness. It also preserves task performance
after fine-tuning on downstream tasks. Together, these findings demonstrate the
suitability of NLI data as an effective means of bias mitigation, as opposed to
only using unlabeled sentences in the literature. Finally, we identify that
existing approaches often use evaluation settings that are insufficient or
inconsistent. We make an effort to reproduce and compare previous methods, and
call for unifying the evaluation settings across gender debiasing methods for
better future comparison. Comment: Accepted to EMNLP 2022. Code and models are publicly available at
https://github.com/princeton-nlp/mabe
Towards Foundational Semantics - Ontological Semantics Revisited -
Cimiano P, Reyle U. Towards Foundational Semantics - Ontological Semantics Revisited -. In: Bennett B, Fellbaum C, eds. Formal Ontology in Information Systems, Proceedings of the Fourth International Conference, FOIS 2006. Frontiers in Artificial Intelligence and Applications, 150. IOS Press; 2006: 51-62
Publishing and Linking WordNet using lemon and RDF
McCrae J, Fellbaum C, Cimiano P. Publishing and Linking WordNet using lemon and RDF. In: Proceedings of the 3rd Workshop on Linked Data in Linguistics. 2014
The Compositional Nature of Verb and Argument Representations in the Human Brain
How does the human brain represent simple compositions of objects, actors, and
actions? We had subjects view action sequence videos during neuroimaging (fMRI)
sessions and identified lexical descriptions of those videos by decoding (SVM)
the brain representations based only on their fMRI activation patterns. As a
precursor to this result, we had demonstrated that we could reliably and with
high probability decode action labels corresponding to one of six action videos
(dig, walk, etc.), again while subjects viewed the action sequence during
scanning (fMRI). This result was replicated at two different brain imaging
sites with common protocols but different subjects, showing common brain areas,
including areas known for episodic memory (PHG, MTL, high level visual
pathways, etc., i.e. the 'what' and 'where' systems, and TPJ, i.e. 'theory of
mind'). Given these results, we were also able to successfully show a key
aspect of language compositionality based on simultaneous decoding of object
class and actor identity. Finally, combining these novel steps in 'brain
reading' allowed us to accurately estimate brain representations supporting
compositional decoding of a complex event composed of an actor, a verb, a
direction, and an object. Comment: 11 pages, 6 figures
SemEval-2010 Task 17: All-words Word Sense Disambiguation on a Specific Domain
Domain portability and adaptation of NLP components and Word Sense Disambiguation systems present new challenges. The difficulties found by supervised systems to adapt might change the way we assess the strengths and weaknesses of supervised and knowledge-based WSD systems. Unfortunately, all existing evaluation datasets for specific domains are lexical-sample corpora. This task presented all-words datasets on the environment domain for WSD in four languages (Chinese, Dutch, English, Italian). Eleven teams participated, with supervised and knowledge-based systems, mainly in the English dataset. The results show that in all languages the participants were able to beat the most frequent sense heuristic as estimated from general corpora. The most successful approaches used some sort of supervision in the form of hand-tagged examples from the domain.
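The most frequent sense heuristic mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the task's official scorer: the function names, the sense labels, and both tiny datasets are hypothetical. Each word is assigned the sense it carries most often in a general sense-tagged corpus, and that assignment is then scored on domain-specific test items.

```python
from collections import Counter, defaultdict


def train_mfs(tagged_corpus):
    """Map each word to the sense it carries most often in the tagged corpus."""
    counts = defaultdict(Counter)
    for word, sense in tagged_corpus:
        counts[word][sense] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def accuracy(mfs, test_items):
    """Fraction of test items whose gold sense equals the word's most frequent sense."""
    correct = sum(1 for word, sense in test_items if mfs.get(word) == sense)
    return correct / len(test_items)


# Hypothetical general-corpus annotations (word, sense label).
general = [
    ("bank", "bank%finance"), ("bank", "bank%finance"), ("bank", "bank%river"),
    ("plant", "plant%factory"), ("plant", "plant%flora"), ("plant", "plant%factory"),
]
# Hypothetical environment-domain test items, where other senses dominate.
domain_test = [
    ("bank", "bank%river"), ("plant", "plant%flora"),
    ("bank", "bank%river"), ("plant", "plant%factory"),
]

mfs = train_mfs(general)
baseline = accuracy(mfs, domain_test)
print(mfs, baseline)
```

In this toy data the general-corpus sense distribution differs from the domain's, so the baseline scores poorly, which is the kind of domain shift the task was designed to probe.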
KYOTO: A System for Mining, Structuring, and Distributing Knowledge Across Languages and Cultures
We outline work performed within the framework of a current EC project. The goal is to construct a language-independent information system for a specific domain (environment/ecology/biodiversity) anchored in a language-independent ontology that is linked to wordnets in seven languages. For each language, information extraction and identification of lexicalized concepts with ontological entries is carried out by text miners ("Kybots"). The mapping of language-specific lexemes to the ontology allows for crosslinguistic identification and translation of equivalent terms. The infrastructure developed within this project enables long-range knowledge sharing and transfer across many languages and cultures, addressing the need for global and uniform transition of knowledge beyond the specific domains addressed here.